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Abstract. In this paper histograms of user ratings for movies (1*, . . . , 10*) are analysed. The evolving 
stabilised shapes of histograms follow the rule that all are either double- or triple-peaked. Moreover, at most one peak can be on the central bins 2*, . . . , 9* and the distribution in these bins looks smooth 'Gaussian-like', while changes at the extremes (1* and 10*) often look abrupt. It is shown that this is well approximated under the assumption that histograms are confined and discretised probability density functions of Lévy skew α-stable distributions. These distributions are the only stable distributions which could emerge due to a generalized central limit theorem from averaging of various independent random variables, as which one can see the initial opinions of users. Averaging is also an appropriate assumption about the social process which underlies the process of continuous opinion formation. Surprisingly, it is not the normal distribution that achieves the best fit over histograms observed on the web, but distributions with fat tails which decay as power laws with exponent −(1 + α) (α = 4/3). The scale and skewness parameters of the Lévy skew α-stable distributions seem to depend on the deviation from an average movie (with mean about 7.6*). The histogram of such an average movie has no skewness and is the narrowest one. If a movie deviates from average, the distribution gets broader and skewed. The skewness pronounces the deviation. This is used to construct a one-parameter fit which gives some evidence of universality in
processes of continuous opinion dynamics about taste.

PACS. 89.20.Hh World Wide Web, Internet; 89.75.Da Systems obeying scaling laws

1 Introduction



Are there universal laws underlying the dynamics of opinion formation?

Understanding opinion formation is tackled classically by social psychologists and sociologists with experiments (see e.g. (...)), but also by the social simulation (see e.g. (...)) and sociophysics (see surveys (...)) communities. Often studies are either empirical but on small experimental samples or, on the contrary, they analyse models analytically or by simulation but without empirical validation. Both restrict the possibility to draw conclusions on universality in real-world opinion formation. This is to a large extent due to the difficulties in getting large-scale data on human opinions. But this situation is changing rapidly thanks to the world wide web, where rating modules are almost ubiquitous. (In the meantime the ubiquity of ratings has raised the question how to standardise rating modules.)
This paper is an attempt to exploit rating data to extract universal properties in opinion formation processes. Specifically, the focus here is on opinions about the quality of movies, as expressed by users on movie rating sites. Ratings stand as a proxy for any opinion related to taste which is one-dimensional and of a continuous nature ('continuous' means expressible as a real number and also gradually adjustable (at least to some extent)). Apparently, possible user ratings are discrete (1* (awful), . . ., 10* (excellent)), but the continuous nature (in the sense of ordered numbers) is also obvious.

Thus, this paper is not about discrete opinion dynamics without a continuous nature (e.g. with respect to a decision: 'yes' or 'no'), as often studied in physics because of the analogy to spin systems. This paper is also not about multidimensional, many-faceted opinions (as e.g. in (19; ...)) but about issues which are broken down to one variable: the quality of a movie. It is also important to distinguish the type of opinion. Movie ratings are about taste. There is no true value as, for example, in issues of fact-finding about an unknown quantity. Further on, there is no real physical constraint for opinions. It is always possible to like a movie more than someone else. This is, for example, not the case in opinions about budgeting in the political realm, where opinions have to be within certain bounds. Finally, taste differs from issues about negotiations where there is a clear incentive to agree on a common value (e.g. for prices in trade, or for forming a political party in political issues). In issues of taste there is nevertheless a weaker force to adjust towards the opinions of peers, e.g. for normative reasons ('I'd like to like what my peers like.'). But there might also be a force to adjust away from the opinions of others to pronounce individuality.

User ratings on the world wide web have already been a subject of research. Dellarocas (20) sketches their role for digitising word-of-mouth (with the main focus on reputation mechanisms). Ratings play a key role in some recommendation algorithms, see Goldberg et al (21), Cheung et al (22), and Umayarov et al (23), which work by comparing the rating profiles of different users. They also play a role in a recent method of pricing an option on movie revenues, see Chance et al (24). Salganik et al (...) study the emerging popularity of songs measured by downloads under the impact of the visibility of the number of downloads. They used ratings to check if liking corresponds to downloads, which is the case. But which song gets popular is to some extent arbitrary.

Jiang and Chen (25) argue economically that the implementation of online rating systems can enhance consumer surplus, vendor profitability and social welfare. But they also argue that this could work better in a monopolistic market than in a duopolistic market.

Cosley et al (26) checked how users re-rate movies, especially if they are confronted with a prediction of the quality (like the mean of other ratings). They found a tendency to adjust towards the presented prediction. They also show that users rate quite consistently when they re-rate on other scales (like 5* compared to 10*).

Li and Hitt (27) analysed the time evolution of arriving user reviews. (A review is a text, but it is accompanied by a rating assigned by the writer.) They present an economic model where the utility of a product for a user is determined by individual search attributes, which are known before purchase, and individual quality, which can only be checked after purchase. Both attributes are heterogeneous across the population and purchasing decisions are made with respect to expected quality. Expectations can be influenced by user reviews. Positive reviews of early adopters produce high average ratings and thus too high an expected quality. This triggers purchases by other consumers who then get disappointed and write bad reviews. If individual search attributes towards a product are positively correlated with individual quality then this may imply a declining trend of reviews. This is called positive self-selection bias. Negative correlations imply negative self-selection bias and thus an increasing trend of reviews. These trends are confirmed empirically by book review data on amazon.com, with the majority of products (70%) showing positive self-selection.

The phenomenon of declining average votes has been explained in a different way in (28). The authors argue from the point of view of the writer in front of a computer. Writing a review is costly (in terms of time) and writers want to have an impact on the average vote. Since the average vote over all books is on the positive side, one can only make a difference with a negative review, so writers with a positive attitude hesitate to write a review. (If there are already a lot, why write another?) They also emphasize that internet reviews do not show a group polarization effect, which is known to appear in small groups discussing in the same physical room (5).



There are few studies on characterising the empirical distributions of ratings. In (29) histograms of user ratings (on a 5* scale on movies.yahoo.com) are characterised as U-shaped, while professional critics have a single-peaked usage of the votes (the peak is at 4). Other studies concentrate either on user profile comparison or only on the average vote and how it could impact further votes and sales. In models, idiosyncratic opinions are very often thought to be normally distributed (...). In the model of (...) the beta distribution is used, which lives on a bounded interval.

Normal distribution, beta distribution, and U-shape all fail to match the rating histograms studied here, which are very often triple-peaked. In the following, the idea is introduced that a rating of a user is derived from an originally continuous opinion on the whole real axis. The opinion becomes a rating by discretising and confining it to the rating scale. Further on, we assume that users' original opinions, when it comes to rating, are already arithmetic averages of the expressed opinions of peers, opinions of professional critics and possibly the existing average (similar to the approach in (31)). This implies that limit theorems for sums of random variables play a role.

2 Empirical rating distributions and a simple model

The aim of this paper is to characterise the distribution of ratings towards a certain movie when the rating histogram contains a lot of ratings. For a first analysis of the question some histograms of movie ratings have been collected (32).

A brief inspection of a couple of histograms reveals the following picture: Almost every histogram has either two or three peaks. (A 'peak' is a bin where all neighbour bins are smaller in size. It is a local maximum (or mode) of the probability mass function of the distribution.) In the case of two peaks at least one is at 1* or 10*. In the case of three peaks one is at 1* and one at 10*. The histogram at the central bins 2*, . . . , 9* has a 'Gaussian-function-like' shape with a peak and exponential-looking decay. This gives rise to the idea that the histogram is a discretised version of a probability density function on the real axis which is confined to the interval of possible ratings. Specifically, we consider the opinion of cinemagoers about a movie to be a real-valued random variable with some distribution. When it comes to assigning stars, the voter has to discretise her opinion to the bins 1*, . . . , 10*. Naturally, the voter would discretise according to the intervals ]−∞, 1.5], ]1.5, 2.5], . . . , ]8.5, 9.5], ]9.5, +∞[. If all voters draw their vote from the same distribution, the histogram will have bins with masses proportional to the integrals of the probability density function (pdf) of that distribution over the above intervals. Figure 1 shows how a continuous distribution is confined and discretised to a probability mass function on 1*, . . . , 10*.
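
The confinement and discretisation step, together with the definition of a 'peak', can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a normal distribution is used as a stand-in for the opinion distribution because its cumulative distribution function has a closed form, and the function names (`confine_and_discretise`, `peaks`) as well as the example parameters are hypothetical, not taken from the data.

```python
import math

def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def confine_and_discretise(cdf, n_bins=10):
    """Turn a continuous distribution on the real axis into rating-bin masses.

    Bin 1 collects ]-inf, 1.5], bin k collects ]k-0.5, k+0.5],
    and the last bin collects ]n_bins-0.5, +inf[.
    """
    edges = [k + 0.5 for k in range(1, n_bins)]     # 1.5, 2.5, ..., 9.5
    cum = [0.0] + [cdf(e) for e in edges] + [1.0]   # cdf at -inf, edges, +inf
    return [cum[k + 1] - cum[k] for k in range(n_bins)]

def peaks(pmf):
    """Return the ratings (1-based) whose bin is larger than all neighbour bins."""
    found = []
    for i, mass in enumerate(pmf):
        neighbours = pmf[max(i - 1, 0):i] + pmf[i + 1:i + 2]
        if all(mass > other for other in neighbours):
            found.append(i + 1)
    return found

# Illustrative parameters only (assumed, not fitted to any movie).
narrow = confine_and_discretise(lambda x: normal_cdf(x, 7.6, 1.2))
shifted = confine_and_discretise(lambda x: normal_cdf(x, 9.4, 1.2))
print(peaks(narrow), peaks(shifted))
```

With these assumed parameters the distribution centred at 7.6 keeps a single central peak, while the one centred at 9.4 piles its mass up at 10*, which is the confinement effect described above.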

The question is now: What is this distribution and how universally can it be parameterised? Before trying to answer this question by looking at the data we formulate a simple social theory which limits the possible distributions to 'Gaussian-like' shapes.

Fig. 1. Explanation of confined Lévy skew α-stable distributions transformed to rating histograms. Shown are best fits for two example movies. (Left panel: rating histogram "I Am Legend" (2007) with pdf for Lévy skew α-stable S(α, β, γ, μ; 1), α = 1.3202, β = 0.045661, γ = 1.2454, μ = 7.7312. Right panel: rating histogram "Pulp Fiction" (1994) with pdf for Lévy skew α-stable S(α, β, γ, μ; 1), α = 1.2669, β = −0.00073142, γ = 1.1933, μ = 9.3817. * is the S(α, β, γ, μ; 1)-pdf scaled with the number of votes.)

It is natural to assume that people do not make up their mind about a movie independently of the opinions of others. Each cinemagoer might adjust her initial impression towards the opinions of others, towards the existing mean rating or towards ratings of professional critics. This is modelled by taking an average of several opinions as the final opinion of a cinemagoer. Here, several aspects might be important, like social networks including correlations of links and initial impressions, opinion leaders, timing effects and so on. But if we assume that initial impressions are drawn from a random variable with finite variance, averaging a large enough number of opinions implies a distribution of averaged opinions close to a normal distribution due to the central limit theorem. This also holds when individual random variables are different, under some additional mild assumptions. The limit theorem also holds for contrasting forces like 'if I observe the average to be 1* higher than my opinion, I lower my opinion', as long as the forces are linear. According to this theory of opinion making, the histogram of ratings should be a discretised and confined probability density function of a normal distribution. As it will turn out, the normal distribution does not fit well: either the highest peak is not reached or the decay of bin size with distance from the highest peak is too fast.



Alternatively, we might assume that initial impressions are drawn from fat-tailed distributions. This implies that the distributions do not have a finite variance. The probability of extreme initial impressions might not vanish exponentially but as a power law with exponent −(1 + α). If this is the case, a generalisation of the central limit theorem says that an average of these random variables has a distribution close to a Lévy skew α-stable distribution (the parameter α must indeed be universal for this theorem). So, we can keep the theory of averaging, but extend from the normal distribution to the wider class of Lévy skew α-stable distributions.
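
The difference between the two averaging regimes can be made tangible with a small simulation. The following is a rough sketch, not part of the analysis in this paper: the symmetric Pareto-type sampler, the value α = 4/3, the numbers of opinions and users, and the seed are all assumptions chosen for illustration.

```python
import random

def symmetric_pareto(alpha, rng):
    """Draw a symmetric variable with P(|X| > x) = x**(-alpha) for x >= 1."""
    u = 1.0 - rng.random()                 # uniform on ]0, 1]
    magnitude = u ** (-1.0 / alpha)
    return magnitude if rng.random() < 0.5 else -magnitude

def averaged_opinions(draw, n_opinions, n_users, rng):
    """Each user's final opinion is the arithmetic mean of n_opinions draws."""
    return [sum(draw(rng) for _ in range(n_opinions)) / n_opinions
            for _ in range(n_users)]

rng = random.Random(0)   # fixed seed, chosen arbitrarily
finite_var = averaged_opinions(lambda r: r.gauss(0.0, 1.0), 100, 1000, rng)
fat_tailed = averaged_opinions(lambda r: symmetric_pareto(4.0 / 3.0, r), 100, 1000, rng)

# The most extreme averaged opinion in each population.
largest_gauss = max(abs(v) for v in finite_var)
largest_fat = max(abs(v) for v in fat_tailed)
print(largest_gauss, largest_fat)
```

Averages of finite-variance impressions stay tightly concentrated, whereas averages of fat-tailed impressions still contain extreme values: averaging does not remove the power-law tail, which is why the limit distribution is no longer normal.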

The Lévy skew α-stable distributions are the only stable distributions (see (33)). Such a distribution has four parameters α, β, γ, μ and is abbreviated S(α, β, γ, μ). (There are several parametrisations of the Lévy skew α-stable distribution. The one used here is S(α, β, γ, μ; 1) as explained in (33).) Its probability density function is

$$f_{S(\alpha,\beta,\gamma,\mu)}(x) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \varphi(t;\alpha,\beta,\gamma,\mu)\, e^{-itx}\, dt \qquad (1)$$

with φ(t; α, β, γ, μ) being its characteristic function given by

$$\varphi(t;\alpha,\beta,\gamma,\mu) = \exp\left[\, i\mu t - |\gamma t|^{\alpha} \left(1 - i\beta\, \mathrm{sign}(t)\, \Phi\right) \right] \qquad (2)$$

and Φ = tan(πα/2) if α ≠ 1 and Φ = −(2/π) log|t| if α = 1. The four parameters are α ∈ ]0, 2], β ∈ [−1, 1], γ ∈ [0, ∞[, and μ ∈ ]−∞, ∞[. The first two parameters are shape parameters, where α represents the peakedness and β the skewness; μ and γ are location and scale parameters. (But notice that β is not the skewness in terms of the third moment, and α is not the peakedness in terms of kurtosis.) For α > 1, μ also represents the mean of the distribution (otherwise the mean is not defined). Figure 2 shows how the parameters modify the shape of the probability density function. Small α represents a sharp peak but heavy tails which asymptotically decay as power laws with exponent −(α + 1). Maximal α = 2 is the normal distribution with exponential decay at the tails. The scale parameter γ corresponds to the variance σ² by the relation σ² = 2γ² only for α = 2. For lower α the variance is infinite. Skewness β = 0 gives a distribution symmetric around the mean, a positive β implies a heavier right tail, a negative β a heavier left tail, but with the same decay on both sides. Only in the case β = ±1 one tail vanishes completely. If α = 2 then β has no effect. Only the special cases of the normal distribution (S(2, 0, γ, μ)), the Cauchy distribution (S(1, 0, γ, δ)) and the Lévy distribution (S(1/2, 1, γ, δ)), with location δ, have closed form expressions.
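
Since no closed form exists in general, the density has to be obtained numerically from Eqs. (1) and (2). The following is a minimal sketch of one way to do this, assuming a plain trapezoidal rule on a truncated half-line; the function name `stable_pdf` and the defaults `t_max` and `dt` are illustrative choices, not a prescribed implementation.

```python
import math

def stable_pdf(x, alpha, beta, gamma, mu, t_max=20.0, dt=0.005):
    """Density of S(alpha, beta, gamma, mu; 1) at x via numerical inversion.

    Uses f(x) = (1/pi) * integral_0^inf Re[phi(t) * exp(-i*t*x)] dt, which
    follows from Eq. (1) because phi(-t) is the complex conjugate of phi(t).
    The integral is truncated at t_max and evaluated with the trapezoidal rule.
    """
    n_steps = int(round(t_max / dt))
    total = 0.0
    for k in range(n_steps + 1):
        t = k * dt
        if t == 0.0:
            value = 1.0                      # phi(0) = 1
        else:
            damping = (gamma * t) ** alpha   # |gamma * t| ** alpha for t > 0
            if alpha == 1.0:
                big_phi = -(2.0 / math.pi) * math.log(t)
            else:
                big_phi = math.tan(math.pi * alpha / 2.0)
            # Re[phi(t) e^{-itx}] = exp(-damping) * cos(t*(mu - x) + beta*big_phi*damping)
            value = math.exp(-damping) * math.cos(t * (mu - x) + beta * big_phi * damping)
        weight = 0.5 if k in (0, n_steps) else 1.0
        total += weight * value
    return total * dt / math.pi

# Sanity checks against the closed-form special cases.
print(stable_pdf(0.0, 2.0, 0.0, 1.0, 0.0), 1.0 / (2.0 * math.sqrt(math.pi)))  # normal, sigma^2 = 2
print(stable_pdf(0.0, 1.0, 0.0, 1.0, 0.0), 1.0 / math.pi)                      # Cauchy
```

The two printed pairs compare the numerical value with the known normal (σ² = 2γ²) and Cauchy densities at the centre, which gives a quick check that the sign and scale conventions of Eq. (2) are implemented consistently.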

In the following the Lévy skew α-stable distributions (discretised and confined) will be fitted to each empirical rating distribution in the data set.



3 Fitted Lévy skew α-stable distributions

Fitting is done by minimising least squares of the difference of the normalised empirical rating histogram to the confined and discretised probability density function of Lévy skew α-stable distributions with parameters (α, β, γ, μ). (For numerical reasons, fitting has been done with a different parameterisation S(α, β, γ, δ; 0) (see (33)). The parameters α, β, γ are equal to the former parameterisation and μ = δ − βγ tan(πα/2).) Computation was performed as follows: The values of the probability density function f_{S(α,β,γ,μ)}(x) are computed for x = −20, −19, . . . , 29, 30 by computing and integrating the characteristic function (Eq. 2) on t = −20, . . . , 20 in steps of 0.005. Then the values for x = −20, . . . , 1 are summed up and set on bin 1, and the values for x = 10, . . . , 30 are summed up and set on bin 10. This produces a probability mass function on 1, . . . , 10 for (α, β, γ, μ). Results were reasonably good; the missing mass of the tails (below −20 and above +30) was mostly below 0.3%. The fitting was computed by minimising the squares of distances of the probability mass function for S(α, β, γ, μ) to the normalised empirical rating distribution. The minima were found with the matlab-function fminsearch. The search converged in 1081 cases (99.5%); in the remaining cases it terminated by reaching the maximum number of iterations. Finding a global minimum is not guaranteed by this method, but results looked convincing. (Experimentally, some fits have been computed via minimising by gradient descent. It led to very similar fits.) We refer to this fit as fit(α, β, γ, μ). Examples of fits are shown in Figure 1.
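
The construction of the probability mass function described above can be sketched as follows. This is a simplified Python illustration, not the matlab code used for the fits: it assumes α ≠ 1, integrates only over t ≥ 0 (using the symmetry of the characteristic function) with a trapezoidal rule, and all function names are hypothetical. The optimisation step itself is omitted.

```python
import math

def stable_pdf(x, alpha, beta, gamma, mu, t_max=20.0, dt=0.005):
    """Density of S(alpha, beta, gamma, mu; 1) at x for alpha != 1 (trapezoidal inversion)."""
    skew = beta * math.tan(math.pi * alpha / 2.0)
    n_steps = int(round(t_max / dt))
    total = 0.5                                   # integrand at t = 0 is 1, half weight
    for k in range(1, n_steps + 1):
        t = k * dt
        damping = (gamma * t) ** alpha
        value = math.exp(-damping) * math.cos(t * (mu - x) + skew * damping)
        total += 0.5 * value if k == n_steps else value
    return total * dt / math.pi

def mu_from_delta(delta, alpha, beta, gamma):
    """Convert the location of S(alpha, beta, gamma, delta; 0) to mu of the 1-parameterisation."""
    return delta - beta * gamma * math.tan(math.pi * alpha / 2.0)

def confined_pmf(alpha, beta, gamma, mu):
    """Probability mass on ratings 1..10 from pdf values at x = -20, ..., 30."""
    pmf = [0.0] * 10
    for x in range(-20, 31):
        rating = min(max(x, 1), 10)               # x <= 1 goes to bin 1, x >= 10 to bin 10
        pmf[rating - 1] += stable_pdf(x, alpha, beta, gamma, mu)
    return pmf

# Mean fitted parameters reported in Table 1, used here only as example input.
pmf = confined_pmf(1.3261, 0.0159, 1.2021, 7.6326)
print([round(p, 4) for p in pmf], round(sum(pmf), 4))
```

The printed total falls slightly short of one because the tails below −20 and above +30 are cut off; a least-squares fit would wrap `confined_pmf` in a derivative-free optimiser such as a Nelder-Mead simplex search, which is the method behind fminsearch.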

Table 1 shows the mean values of fitted parameters over all movies as well as goodness-of-fit measures. The sum of squared errors ($\mathrm{SSE} = \sum_{i=1}^{10} (r_i - f_{S(\alpha,\beta,\gamma,\mu)}(i))^2$ with $r_i$ being the fraction of ratings for i*) is on average very small, and the coefficient of determination R² is on average almost one. ($R^2 = 1 - \mathrm{SSE} / \sum_{i=1}^{10} (r_i - \langle r \rangle)^2$, with $r_i$ the fraction of ratings for i* (therefore ⟨r⟩ = 0.1).) Both reflect that indeed most fits look impressively close to the empirical histogram. Further on, a Kolmogorov-Smirnov test has been performed for each movie. (Done with the matlab-function kstest2 on the vector of all ratings and a vector with the same number of ratings as expected according to the fit.) With level of significance 0.05 the null hypothesis that the expected fitted distribution and the empirical histogram are drawn from the same distribution could not be rejected for 68.7% of the movies. The Kolmogorov-Smirnov test is very strict; it is very likely to reject the null hypothesis for large sample sizes. Given the high number of ratings (> 20,000) for each movie this rate is still impressive. But it is also clear that Lévy skew α-stable distributions cannot fully explain all possible rating histograms.
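
The two goodness-of-fit measures are simple to compute once the empirical fractions and the fitted probability masses are available. The sketch below uses made-up numbers purely for illustration (both lists are hypothetical and not taken from the data set).

```python
def sse(ratings_share, fitted_pmf):
    """Sum of squared errors between empirical fractions and fitted bin masses."""
    return sum((r - f) ** 2 for r, f in zip(ratings_share, fitted_pmf))

def r_squared(ratings_share, fitted_pmf):
    """Coefficient of determination R^2 = 1 - SSE / sum_i (r_i - <r>)^2."""
    mean_share = sum(ratings_share) / len(ratings_share)   # 0.1 for ten bins summing to 1
    total = sum((r - mean_share) ** 2 for r in ratings_share)
    return 1.0 - sse(ratings_share, fitted_pmf) / total

# Hypothetical example values (fractions of ratings for 1*, ..., 10*).
empirical = [0.05, 0.02, 0.03, 0.05, 0.08, 0.14, 0.22, 0.21, 0.11, 0.09]
fitted = [0.04, 0.02, 0.03, 0.05, 0.09, 0.15, 0.21, 0.20, 0.12, 0.09]

print(sse(empirical, fitted), r_squared(empirical, fitted))
```

A fit that merely predicts the uniform value 0.1 for every bin yields R² = 0, so R² close to one means the fit explains almost all of the variation between bins.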

For comparison Table 1 also contains mean values for a fit with normal distributions S(2, 0, γ, μ). The goodness-of-fit parameters are worse. This is natural because there are fewer free parameters, but clearly the normal distribution is ruled out as an appropriate candidate.

Figure 3 shows the parameters of best fits for all movies as scatter plots. All four subplots show μ on the abscissa. Dark points indicate movies whose fits have a small sum of squared errors (SSE), red stars indicate medium SSE, and yellow stars indicate bad fits with high SSE.

The first plot shows μ with respect to the original average of ratings. It shows that μ is spread wider than the original average. So, μ can serve as a measure for movie quality which differentiates better than the original average.

ratings measure   ⟨ratings⟩   corr-coef   ⟨fit(α,β,γ,μ)⟩   ⟨fit(2,0,γ,μ)⟩   ⟨fit(μ)⟩   fit parameter
mean                7.3464      0.5772        7.6326           7.6590        7.5862    μ
std                 1.9669      0.8883        1.2021           1.2456        1.1993    γ
skewness           -1.0610     -0.2829        0.0159           0            -0.0114    β
kurtosis            1.8581     -0.0138        1.3261           2             4/3       α
                                              0.0002           0.0035        0.0035    SSE
                                              0.9965           0.9404        0.9434    R²
                                              68.7%            0%            4.5%      K-S

Table 1. Aggregated measures on the data set and on three confined Lévy skew α-stable fits (fit(α, β, γ, μ), fit(2, 0, γ, μ), fit(μ)). Mean, standard deviation, skewness, and kurtosis are computed for the histograms of each movie. Parameters of fits are also computed for each movie. The table shows the mean values for all 1,086 movies. The correlation coefficient is computed for the 'analog' measures for the ratings and fit(α, β, γ, μ). The low (and negative) correlations of skewness vs. β and kurtosis vs. α show that the parameters of the fit deliver information on the distribution which is not extracted by the 'standard measures' on the raw data. Goodness-of-fit parameters are computed for each fit for each movie. The mean over all movies is shown for the sum of squared errors (SSE) and the coefficient of determination R-square. For the Kolmogorov-Smirnov test (K-S) the rate of not rejecting the hypothesis of a common distribution of ratings and fitted distribution is given.

Fig. 3. Parameters of best fit for confined Lévy skew α-stable distributions for all movies. μ is the mean of the distribution, α modifies peakedness and tail exponent, β skewness, and γ scales how broad the distribution is.

The remaining three subplots show the relations of μ to the other three parameters α, β, γ of the best Lévy skew α-stable fits. The blue dots in the two bottom plots show the averages of the ordinate values within the μ-region marked by the grid lines. The green lines represent α = 4/3, for β the best linear fit for the blue dots, and for γ the best quadratic fit. The plots show that the peakedness α concentrates on values between 1.2 and 1.5, which is clearly below the value α = 2 of the normal distribution. The average value is ⟨α⟩ = 1.3261 ≈ 4/3. For the skewness β there is a clear trend with respect to μ. Interestingly, β = 0 is most likely almost exactly at μ = 7.6326, which is equal to ⟨μ⟩. For better movies there is an additional positive skewness (meaning that the right tail is fatter). Correspondingly, for movies worse than ⟨μ⟩ there is additional negative skewness (meaning that the left tail is fatter). For the scale parameter there is also a clear trend visible. The narrowest distribution is also achieved almost exactly for movies with μ = ⟨μ⟩. For better and worse movies the distributions get broader.

It is not a priori clear and thus remarkable that ⟨μ⟩ plays a central role for the deviations in β and γ with respect to μ. This gives rise to the speculation that ⟨μ⟩ is kind of universal modulo the scale of ratings (here 1, . . . , 10). This is underpinned by the finding of Cosley et al (26) that users rate consistently in different rating schemes. If we rescale ⟨μ⟩ = 7.6326 to the scale 1, . . . , 5 we get 4.0663, which coincides almost exactly with 4.07, the mean rating of books averaged over all books in the amazon.com sample of Li and Hitt (27). The rescaling is done under the assumption that each rating stands for a bin centred on the rating with width equal to the distance of successive ratings (here 1). Thus a 10*-rating $r_{10}$ is converted to the 5*-rating by $r_5 = 5\,\frac{r_{10} - 0.5}{10} + 0.5$. This ensures, for example, that 1* in a 5*-rating corresponds to 1.5* in a 10*-rating, and 5* corresponds to 9.5*. It does not coincide as well with 3.44, which was found by (29) for movies.yahoo.com data. The deviation may come from two differences: First, in (27) and in this study the average reported is the average of the average ratings of the individual books or movies, while (29) reports the pure average rating over all ratings in the database. Second, (27) and this study select books and movies, respectively, in a similar way: this study by all having more than 20,000 ratings, and (27) by being on a bestseller list and having a sufficient number of reviews. Both sampling methods imply a similar selection bias which is different from (29), which collects all movies released in 2002.
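
The rescaling between rating schemes amounts to mapping a rating to its position on a generalised scale [0, 1] and back. A small sketch, with hypothetical function names, under the bin-centred assumption stated above:

```python
def unit_scale(rating, n_max=10):
    """Position of a rating on the generalised scale [0, 1].

    Assumes each rating k stands for a bin ]k - 0.5, k + 0.5] on a scale 1..n_max.
    """
    return (rating - 0.5) / n_max

def rescale_rating(rating, from_max=10, to_max=5):
    """Convert a rating on the scale 1..from_max to the scale 1..to_max."""
    return to_max * unit_scale(rating, from_max) + 0.5

# 1.5* and 9.5* on the 10*-scale map to the end points of the 5*-scale.
print(rescale_rating(1.5), rescale_rating(9.5))
# The mean fitted location from Table 1 on the 5*-scale and on [0, 1].
print(rescale_rating(7.6326), unit_scale(7.6326))
```

The same `unit_scale` helper gives the position of the average movie on the generalised scale [0, 1].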

Taking this speculation as true, it means that an average movie receives an average vote of about 0.71 on a generalised scale [0, 1]. This indicates a universal strong positive bias for the average movie. The strong positive bias may be implied by an overall selection bias: users select movies or products they are likely to like, or they even like movies and products just because they paid for them. In contrast, a negative bias is reported for ratings of jokes in (21). Following the results of fit(α, β, γ, μ) we further conclude that the distribution of ratings for an average movie has no skewness (β = 0) and the smallest scale parameter (here γ = 1.1). If a movie deviates from average, this implies higher deviations in the distribution (γ > 1.1) and a skewness which pronounces the deviation from the average movie. The latter observation can be regarded as a hint for a socially implied positive feedback in determining opinions on movies whose quality is above (or below) that of an average movie.

Taking the trends displayed by the green lines in Figure 3 one can construct a one-parameter fit on μ with α = 4/3, β determined by the linear fit and γ by the quadratic fit. The equations to compute β and γ from μ are $\beta = b_1 \mu + b_2$ and $\gamma = c_1 \mu^2 + c_2 \mu + c_3$ with parameters $b_1 = 0.1178$, $b_2 = -0.9049$, $c_1 = 0.05342$, $c_2 = -0.8388$, $c_3 = 4.401$. We refer to this fit as fit(μ). Mean values and mean goodness-of-fit measures are also shown in Table 1. The one-parameter fit achieves a better goodness of fit than the two-parameter fit(2, 0, γ, μ).
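
For illustration, the one-parameter family can be written down directly from the coefficients given above. The function name `fit_mu_parameters` is hypothetical; the coefficients are the ones stated in the text.

```python
ALPHA = 4.0 / 3.0
B1, B2 = 0.1178, -0.9049
C1, C2, C3 = 0.05342, -0.8388, 4.401

def fit_mu_parameters(mu):
    """Return (alpha, beta, gamma, mu) of the one-parameter fit(mu)."""
    beta = B1 * mu + B2                     # linear trend of the skewness
    gamma = C1 * mu ** 2 + C2 * mu + C3     # quadratic trend of the scale
    return ALPHA, beta, gamma, mu

for mu in (6.0, 7.6326, 9.0):
    alpha, beta, gamma, _ = fit_mu_parameters(mu)
    print(f"mu={mu:.4f}  alpha={alpha:.4f}  beta={beta:+.4f}  gamma={gamma:.4f}")

# Where the fitted trends cross zero skewness and reach minimal scale.
print(-B2 / B1, -C2 / (2.0 * C1))
```

With these coefficients the linear trend crosses β = 0 at μ of about 7.68 and the quadratic trend has its minimum (γ of about 1.11) at μ of about 7.85, both close to the average movie.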

Figure 4 shows how fit(μ) is able to approximate empirical histograms. The shape of empirical distributions is well captured, but variations between different movies are big enough to conclude that fit(μ) can only be seen as a baseline case. Movies can have some individual characteristics of their rating distribution which go beyond the quality (captured in μ). Deviations from the baseline case can be used to classify movies in a new way and to understand what the cause of deviations might be. This is a task for further research.

Finally, Figure 5 shows a comparison of theoretical histograms of fit(μ) and the average empirical histograms. The theoretical histograms are for μ = 5, 5.5, . . . , 9 and the average empirical histograms are over all movies with fitted value of μ within the intervals [4.75, 5.25], [5.25, 5.75], . . . , [8.25, 8.75], [8.75, 9.25]. The similarity underpins that fit(μ) can really serve as a good baseline case. But some deviations from the baseline case seem to be not totally random. E.g. the residuals show that the size of the 1* bin for low quality movies (μ < 7) is on average predicted too high, while the 2* and 3* bins are on average predicted too low.



4 Conclusion 

With some success rating histograms were fitted to confined Lévy skew α-stable distributions. This clearly demonstrates that the assumptions that opinions are normally distributed, beta distributed or U-shaped around the quality of the movie are not valid. Some histograms have of course a U-shaped (or better J-shaped) form, e.g. the right-hand side in Figure 1. But a U-shape cannot approximate all histograms, e.g. the left-hand side of Figure 1.

If expressed opinions of users are assumed to be weighted averages of formerly expressed opinions of others, this implies that these opinions must come from distributions with fat tails with a power-law tail index α of about 1.2 to 1.5 to get good fits. Further on, the scale and skewness parameters of the best fits change systematically with the deviation of the mean from the mean of an average movie (with μ ≈ 7.6). A movie better than average shows right skewness and a larger scale parameter. A movie worse than average shows left skewness and also a larger scale parameter. Thus, better movies also have a heavier tail on the better side and worse movies have a heavier tail on the worse side. In general, distributions get broader when deviating from the mean. Both observations seem plausible from a sociological point of view. The new measures of skewness (β) and peakedness (α) are not the same as the classical skewness and excess kurtosis which are computed directly from the sample data (see Table 1). There is no correlation between the two kinds of measures, or even a negative one. This underpins that fitting rating histograms as confined distributions really delivers a new characterisation. A further advantage of this approach is that the Lévy skew α-stable distribution defines a distribution completely, which mean, standard deviation, skewness and excess kurtosis do not.

Jan Lorenz: Universality in movie rating distributions

Fig. 4. The theoretical probability mass functions according to the parameters of fit(μ) for μ = 6, 7.5, 9 and all empirical histograms which received best fitted values for μ ∈ [5.98, 6.02], [7.48, 7.52], [8.98, 9.02]. (Panels: μ-fit and histograms of movies around μ = 6, μ = 7.5 and μ = 9; x-axis: ratings 1 to 10; y-axis ticks at 10%, 20%, 30%.)

Fig. 5. (Panels: Lévy skew α-stable pdf of μ-fit; average rating histograms; residuals: average ratings minus fitted pdf; x-axis: votes.) ... of both. The stars mark the underlying μ-values of curves. Colors are the same in all plots.

A one-parameter fit based on these observations turns out to approximate the data well, but is not able to establish a strict characterisation of movie histograms. Deviations from the constructed baseline case are not negligible. Nevertheless, it could be useful to characterise movies by their deviation from their baseline case. Further on, there might be a selection bias in the data, because only movies with a large number of ratings were selected. The fit might not work for movies with fewer ratings. The method might be used to detect attacks of enthusiastic fans (or movie companies) who try to rate movies up.

There seems to be some universality in movie rating distributions, which may be implied by people adjusting their opinions with peers and other sources of opinions. Clearly, other theories which may imply other underlying distributions need to be developed and checked against data, and this theory also needs to be checked against data from other sources to clarify universality in continuous opinion dynamics about taste.